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Monte Carlo simulations are one of the major tools in statistical pliysics, complex system science, and other 
fields, and an increasing number of these simulations is run on distributed systems like clusters or grids. This 
raises the issue of generating random numbers in a parallel, distributed environment. In this contribution we 
demonstrate that multiple linear recurrences in finite fields are an ideal method to produce high quality pseu- 
dorandom numbers in sequential and parallel algorithms. Their known weakness (failure of sampling points in 
high dimensions) can be overcome by an appropriate delinearization that preserves all desirable properties of 
the underlying linear sequence. 
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1. Introduction 

The Monte Carlo method is a major industry and random num- 
bers are its key resource. In contrast to most commodities, 
quantity and quality of randomness have an inverse relation: 
The more randomness you consume, the better it has to be Ql. 
The quality issue arises because simulations only approximate 
randomness by generating a stream of deterministic numbers, 
named pseudorandom numbers (PRNs), with successive calls 
to a pseudorandom number generator (PRNG). Considering 
the ever increasing computing power, however, the quality of 
PRNGs is a moving target. Simulations that consume lO'" 
pseudorandom numbers were considered large-scale Monte 
Carlo simulation a few years ago. On a present-day desktop 
machine this is a ten minute run. 

More and more large scale simulations are run on parallel 
systems like networked workstations, clusters, or "the grid." 
In a parallel environment the quality of a PRNG is even more 
important, to some extent because feasible sample sizes are 
easily 10, ... , 10^ times larger than on a sequential machine. 
The main problem is the parallelization of the PRNG itself, 
however Some good generators are not parallelizable at all, 
others lose their efficiency, their quality, or even both when 
being parallelized 01^1]. 

In this contribution we discuss linear recurrences in finite 
fields. They excel as sources of pseudorandomness because 
they have a solid mathematical foundation and they can easily 
be parallelized. Their linear structure, however, may cause 
problems in experiments that sample random points in high 
dimensional space. This is a known issue ("Random numbers 
fall mainly in the planes" 0]), but it can be overcome by a 
proper delinearization of the sequence. 

The paper is organized as follows. In section 121 we briefly 
review some properties of linear recurrences in finite fields. 
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and we motivate their use as PRNGs. In section|3lwe discuss 
various techniques for parallelizing a PRNG. We will see that 
only two of these techniques allow for parallel simulations 
whose results do not necessarily depend on the number of pro- 
cesses, and we will see how linear recurrences support these 
parallelization techniques. Section |3 addresses the problem 
of linear sequences to sample points in high dimensions, mea- 
sured by the spectral test. We show that with a proper trans- 
formation (delinearization) we can generate a new sequence 
that shares all nice properties with the original sequence (like 
parallelizability, equidistribution properties, etc.), but passes 
many statistical tests that are sensitive to the hyperplane struc- 
ture of points of linear sequences in high dimensions. In the 
last two sections we discuss the quality of the linear and non- 
linear sequences in parallel Monte Carlo applications and how 
to implement these generators efficiently. 



2. Recurrent randomness 

Almost all PRNGs produce a sequence (r) = ri , r2 , . . . of pseu- 
dorandom numbers by a recurrence 

ri=f{ri-\,n-2,---,n-k), (1) 

and the art of random number generation lies in the design of 
the function /. Numerous recipes for / have been discussed 
in the literature |^ One of the oldest and most popular 
PRNGs is the linear congruential generator (LCG) 

r; = fl • r,_ 1 + b mod m (2) 

introduced by Lehmer 0]. The additive constant b may 
be zero. Knuth ^ proposed a generalization of Lehmer's 
method known as multiple recursive generator (MRG) 0, 0, 

m 

r,- = fl 1 • r,_ 1 + fl2 • r,-2 + . . . + fl„ • r,_„ mod m . (3) 

Surprisingly, on a computer all recurrences are essentially lin- 
ear This is due to the simple fact that computers operate on 
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numbers of finite precision; let us say integers between and 
'"max- This has two important consequences: 

• Every recurrence Q must be periodic. 

• We can embed the sequence (r) into the finite field F„,, 
the field of integers with arithmetic modulo m, where m 
is a prime number larger than the largest value in the se- 
quence (r). 

The relevance of linear sequences is then based on the follow- 
ing fact (nl corollary 6.2.3]: 

Theorem 1 All finite length sequences and all periodic se- 
quences over a finite field F,„ can be generated by a linear 
recurrence (|3}- 

In the theory of finite fields, a sequence of type (|3} is called 
linear feedback shift register sequence, or LFSR sequence for 
short. Note that a LFSR sequence is fully determined by 
specifying n coefficients (01,02, ... ,«„) plus n initial values 
{ri,r2,...,rn). 

According to Theorem[2all finite length sequences and all 
periodic sequences are LFSR sequences. The linear complex- 
ity of a sequence is the minimum order of a linear recurrence 
that generates this sequence. The Berlekamp-Massey algo- 
rithm fm fl2il is an efficient procedure for finding this short- 
est linear recurrence. The linear complexity of a random se- 
quence of length T on average equals T /2 fl3ll . As a conse- 
quence a random sequence cannot be compressed by coding 
it as a linear recurrence, because T /2 coefficients plus T /2 
initial values have to be specified. Typically the linear com- 
plexity of a nonlinear sequence Q is of the same order of 
magnitude as the period of the sequence. Since the design- 
ers of PRNGs strive for "astronomically" large periods, im- 
plementing nonlinear generators as a linear recurrence is not 
tractable. As a matter of principle, however, all we need to dis- 
cuss are linear recurrences, and any nonlinear recurrence can 
be seen as an efficient implementation of a large order linear 
recurrence. The popular PRNG "Mersenne Twister" fl4ll . for 
example, is an efficient implementation of a linear recurrence 
in F2 of order 19937. 

There is a wealth of rigorous results on LFSR sequences 
that can (and should) be used to construct a good PRNG. Here 
we only discuss a few but important facts without proofs. A 
detailed discussion of LFSR sequences including proofs can 
be found in 

Since the all zero tuple (0,0,..., 0) is a fixed point of 
the maximum period of a LFSR sequence cannot ex- 
ceed m" — 1. The following theorem tells us precisely how to 
choose the coefficients (01,02, ... ,o„) to achieve this period 
& 

Theorem 2 The LFSR sequence over F,„ has period 
m" — 1, if and only if the characteristic polynomial 



is primitive modulo m. 
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A monic polynomial fix) of degree n over F,„ is primitive 
modulo m, if and only if it is irreducible (i. e., cannot be fac- 
torized over F,„), and if it has a primitive element of the ex- 
tension field F„,/i as one of its roots. The number of primitive 
polynomials of degree n modulo m is equal to ^ (m" — 1 ) /n = 
& (m" I (nln(nlnm))) ll20il . where 0(x) denotes Euler's totient 
function. As a consequence a random polynomial of degree 
« is primitive modulo m with probability ~ l/(«ln(nlnm)), 
and finding primitive polynomials reduces to testing whether 
a given polynomial is primitive. The latter can be done effi- 
ciently, if the factorization of m" — 1 is known [1 Ij], and most 
computer algebra systems offer a procedure for this. 

Theorem 3 Let (r) be an LFSR sequence with a primitive 
characteristic polynomial. Then each ^-tuple (r,+i , . . . , r,+<.) e 
{0,l,...,m— 1}*^ occurs m"^^ times per period for k<n (ex- 
cept the all zero tuple for k = n, which never occurs). 

From this theorem it follows that, if a tuple of k successive 
terms A: < « is drawn from a random position of a LFSR se- 
quence, the outcome is uniformly distributed over all possible 
fc-tuples in F„,. This is exactly what one would expect from 
a truly random sequence. In terms of Compagner's ensemble 
theory, tuples of length k<n of a LFSR sequence with primi- 
tive characteristic poly nomial are indistinguishable from truly 
random tuples I2ll 12211 . 

Theorem 4 Let (r) be an LFSR sequence (|3} with period T = 
m" — 1 and let a be a complex mth root of unity and a its 
complex conjugate. Then 

I. \t if/i = Omodr 

Clh) := y a'- ■ a'-+>' = { , (5) 

^ ' f^^ \-\ if/jT^Omodr. 

C(K) can be interpreted as the autocorrelation function of 
the sequence, and Theorem |4] tells us that LFSR sequences 
with maximum period have autocorrelations that are very sim- 
ilar to the autocorrelations of a random sequence with period 
r. Together with the nice equidistribution properties (Theo- 
rem|3} this qualifies LFSR sequences with maximum period as 
pseudonoise sequences, a term originally coined by Golomb 
for binary sequences jTll[l5ll . 



3. Parallelization 



3.1. General parallelization techniques 

In parallel appUcations we need to generate streams tjj of ran- 
dom numbers, where j = 0,1,. — 1 numbers the streams 
for each of the p processes. We require statistical indepen- 
dence of the tjj within each stream and between the streams. 
Four different parallelization techniques are used in practice: 

Random seeding: All processes use the same PRNG but a 
different "random" seed. The hope is that they will generate 
nonoverlapping and uncorrected subsequences of the orig- 
inal PRNG. This hope, however, has no theoretical foun- 
dation. Random seeding is a violation of Donald Knuth's 
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Figure 1 : Parallelization by block splitting. 



advice, "Random number generators should not be chosen 
at random" 0|- 

Parameterization: All processes use the same type of genera- 
tor but with different parameters for each processor. Exam- 
ple: linear congruential generators with additive constant 
bj for the jth stream I23II : 

tj j = a ■ tjj- 1 + bj mod m , (6) 

where the b/s are different prime numbers just below 
y/m/2. Another variant uses different multipliers a for 
different streams fz^. The theoretical foundation of these 
methods is weak, and empirical tests have revealed serious 
correlations between streams I25I1 . On a massive parallel 
system you may need thousands of parallel streams, and it 
is not trivial to find a type of PRNG with thousands of "well 
tested" parameter sets. 

Block splitting: Let M be the maximum number of calls to a 
PRNG by each processor, and let p be the number of pro- 
cesses. Then we can split the sequence (r) of a sequential 
PRNG into consecutive blocks of length M such that 

to,i = n 

h,i = n+M 

(7) 

This method works only if we know M in advance or can at 
least safely estimate an upper bound for M. To apply block 
splitting it is necessary to jump from the ;th random num- 
ber to the (;+M)th number without calculating the num- 
bers in-between, which cannot be done efficiently for many 
PRNGs. A potential disadvantage of this method is that 
long range correlations, usually not observed in sequential 
simulations, may become short range correlations between 
substreams l26l EtIi . Block splitting is illustrated in Fig- 
ure[T] 

Leapfrog: The leapfrog method distributes a sequence (r) of 
random numbers over p processes by decimating this base 
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Figure 2: Parallelization by leapfrogging. 



sequence such that 

tQ.i ~ rpi 
t\,i = rpi+i 

(8) 

Leapfrogging is illustrated in Figure |2l It is the most ver- 
satile and robust method for parallelization and it does not 
require an a priori estimate of how many random numbers 
will be consumed by each processor An efficient imple- 
mentation requires a PRNG that can be modified to gener- 
ate directly only every A:th element of the original sequence. 
Again this excludes many popular PRNGs. 

Note that for a periodic sequence (r) the substreams derived 
from block splitting are cyclic shifts of the original sequence 
(r), and for p not dividing the period of (r), the leapfrog se- 
quences are cyclic shifts of each other Hence the leapfrog 
method is equivalent to block splitting on a different base se- 
quence. 



3.2. Playing fair 

We say that a parallel Monte Carlo simulation plays fair, if its 
outcome is strictly independent of the underlying hardware, 
where strictly means deterministically, not just statistically. 
Fair play implies the use of the same PRNs in the same con- 
text, independently of the number of parallel processes. It is 
mandatory for debugging, especially in parallel environments 
where the number of parallel processes varies from run to run, 
but another benefit of playing fair is even more important: Fair 
play guarantees that the quality of a PRNG with respect to an 
application does not depend on the degree of parallelization. 

Obviously, parametrization or random seeding as discussed 
above prevent a simulation from playing fair Leapfrog and 
block splitting, on the other hand, do allow one to use the same 
PRNs within the same context independently of the number of 
parallel streams. 

Consider the site percolation problem. A site in a lattice of 
size is occupied with some probability, and the occupancy is 
determined by a PRN. M random configurations are generated. 
A naive parallel simulation on p processes could split a base 
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sequence into p leapfrog streams and having each process gen- 
erate K,M / p lattice configurations, independently of the other 
processes. Obviously this parallel simulation is not equivalent 
to its sequential version that consumes PRNs from the base se- 
quence to generate one lattice configuration after another The 
effective shape of the resulting lattice configurations depends 
on the number of processes. This parallel algorithm does not 
play fair. 

We can turn the site percolation simulation into a fair play- 
ing algorithm by leapfrogging on the level of lattice configura- 
tions. Here each process consumes distinct contiguous blocks 
of PRNs from the sequence (r), and the workload is spread 
over p processors in such a way, that each process analyzes 
each pth lattice. If we number the processes by their rank / 
from to /;> — 1 and the lattices from to M — 1, each process 
starts with a lattice whose number equals its own rank. That 
means process / has to skip / • PRNs before the first lattice 
configuration is generated. Thereafter each process can skip 
p — \ lattices, i.e., {p—\)-N PRNs and continue with the next 
lattice. 

Organizing simulation algorithms such that they play fair is 
not always as easy as in the above example, but with a little 
effort one can achieve fair play in more complicated situations, 
too. This may require the combination of block splitting and 
the leapfrog method, or iterated leapfrogging. Sometimes it 
is also necessary to use more than one stream of PRNs per 
process, e. g., in the Swendsen Wang cluster algorithm one 
may use one PRNG to construct the bond percolation clusters 
and another PRNG to decide to flip the cluster 



3.3. Parallelization of LFSR sequences 

As a matter of fact, LFSR sequences do support leapfrog and 
block splitting very well. Block splitting means basically 
jumping ahead in a PRN sequence. In the case of LFSR se- 
quences this can be done quite efficiently. Note, that by intro- 
ducing a companion matrix A the linear recurrence can be 
written as a vector matrix product |28]. 



Implementing leapfrogging efficiently is less straightfor- 
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From this formula it follows immediately that the M-fold suc- 
cessive iteration of (|3} may be written as 
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(10) 



Matrix exponentiation can be accomplished in i^'(n^lnM) 
steps via binary exponentiation (also known as exponentiation 
by squaring). 



ward. Calculating , 
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is no option, because A'' is usually a dense matrix, in which 
case calculating a new element from the leapfrog sequence 
requires (J'{n'^) operations instead of ^(«) operations in the 
base sequence. 

The following theorem assures that the leapfrog subse- 
quences of LFSR sequences are again LFSR sequences fllll . 
This will provide us with a very efficient way to generate 
leapfrog sequences. 

Theorem 5 Let (r) be a LFSR sequence based on a primitive 
polynomial of degree n with period m" — 1 (pseudonoise se- 
quence) over F„„ and let (f ) be the decimated sequence with 
lag p > and offset j, e. g.. 



rpi+j ■ 



(12) 



Then [tj) is a LFSR sequence based on a primitive polynomial 
of degree «, too, if and only if p and m" — 1 are coprime, e. g., 
greatest common divisor gcd(OT" — 1,/?) = L In addition, (r) 
and [tj) are not just cychc shifts of each other, except when 



(13) 



for some <h <n. If gcd(m" — 1 ,/?) > 1 the sequence (tj) is 
still a LFSR sequence, but not a pseudonoise sequence. 

It is not hard to find prime numbers m such that m" — 1 has 
very few (and large) prime factors. For such numbers, the 
leapfrog method yields pseudonoise sequences for any reason- 
able number of parallel streams, see also section l34l and ap- 
pendixlAl 

While Theorem |5] ensures that leapfrog sequences are not 
just cyclic shifts of the base sequence (unless (I13> holds), the 
leapfrog sequences are cyclic shifts of each other. This is true 
for general sequences, not just LFSR sequences. Consider an 
arbitrary sequence (r) with period T. If gcd(r,p) = 1, all 
leapfrog sequences (fi), (f2), • • • , (tp) are cyclic shifts of each 
other, i. e. for every pair of leapfrog sequences (f ) and (f ) 
of a common base sequence (r) with period T there is a con- 
stant s, such that tj^j — tj^,i+s for all i, and s is at least \ T /p\ . 
Furthermore, if gcd(r,p) ~d> I, the period of each leapfrog 
sequence equals T /d and there are d classes of leapfrog se- 
quences. Within a class of leapfrog sequences there are p/d 
sequences, each sequence is just a cyclic shift of another and 
the size of the shift is at least YT / p\ . 

Theorem |5] tells us that all leapfrog sequences of a LFSR 
sequence of degree n can be generated by another LFSR of 
degree n or less. The following theorem [11, Theorem 6.6.1] 
gives us a recipe to calculate the coefficients {bi,b2, ■ ■ ■ ,b„) of 
the corresponding leapfrog feedback polynomial: 
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Theorem 6 Let (f ) be a (periodic) LFSR sequence over the 
field F„, and f{x) its characteristic polynomial of degree n. 
Then the coefficients {bi,b2, ■ ■ ■ ,b„) of f{x) can be computed 
from 2n successive elements of (f) by solving the linear sys- 
tem 
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Starting from the base sequence we determine 2n values of 
the sequence (f ) by applying the leapfrog rule. Then we solve 
J14> by Gaussian elimination to get the characteristic polyno- 
mial for a new LFSR generator that yields the elements of the 
leapfrog sequence directly with each call. If the matrix in ( I14t 
is singular, the linear system has more than one solution, and 
it is sufficient to pick one of them. In this case it is always pos- 
sible to generate the leapfrog sequence by a LFSR of degree 
less than the degree of the original sequence. 



3.4. Choice of modulus 

LFSR sequences over F? with sparse feedback polynomials 
are popular sources of PRNs LS»2£,,2QJ and generators of this 
type can be found in various software libraries. This is due 
to the fact that multiplication in F2 is trivial, addition reduces 
to exclusive-or and the mod -operation comes for free. As a 
result, generators that operate in F2 are extremely fast. Un- 
fortunately, these generators suffer from serious statistical de- 
fects I30H2J that can be blamed quite generally on the small 
size of the underlying field l34ll . see also section l6n In par- 
allel applications we have the additional drawback, that, if the 
leapfrog method is applied to a LFSR sequence with sparse 
characteristic polynomial, the new sequence will have a dense 
polynomial. The computational complexity of generating val- 
ues of the LFSR sequence grows from to ^(«). Re- 
member that for generators in F2, n is typically of order 1000 
or even larger to get a long period 2" — I. 

The theorems and parallelization techniques we have pre- 
sented so far do apply to LFSR sequences over any finite field 
F„,. Therefore we are free to choose the prime modulus m. In 
order to get maximum entropy on the macrostate level iH m 
should be as large as possible. A good choice is to set m to 
a value that is of the order of the largest representable inte- 
ger of the computer. If the computer deals with e-bit registers, 
we may write the modulus as m = 2'^ — k, with k reasonably 
small. In fact if k{k + 2) < m modular reduction can be done 
reasonably fast by a few bit-shifts, additions, and multiplica- 
tions, see section lsTI Furthermore a large modulus allows us 
to restrict the degree of the LFSR to rather small values, e. g., 
n Ri 4, while the PRNG has a large period and good statistical 
properties. 

In accordance with Theorem |5] a leapfrog sequence of a 
pseudonoise sequence is a pseudonoise sequence, too, if and 



only if its period m" — 1 and the lag p are coprime. For that 
reason m" — I should have a small number of prime factors 
ll36ll . It can be shown that m" — 1 has at least three prime fac- 
tors and if the number of prime factors does not exceed three, 
then m is necessarily a Sophie-Germain prime and n a prime 
larger than two, see appendixElfor details. 

To sum up, the modulus m of a LFSR sequence should be 
a Sophie-Germain prime, such that m" — 1 has no more than 
three prime factors and such that m — 2"^ — k and k{k + 2) <m 
for some integers e and k. 



4. Delinearization 

LFSR sequences over prime fields with a large prime modu- 
lus seem to be ideally suited as PRNGs, especially in paral- 
lel environments. They have, however, a well known weak- 
ness. When used to sample coordinates in li-dimensional 
space, pseudonoise sequences cover every point for d <n, and 
every point except (0,0, ... ,0) for d = n. For d> yi the set of 
positions generated is obviously sparse, and the linearity of 
the production rule (|3} leads to the concentration of the sam- 
pling points on «-dimensional hyperplanes IjtIIs^I . see also 
top of Figure|3] This phenomenon, first noticed by Marsaglia 
in 1968 0], motivates one of the well known tests for PRNGs, 
the so-called spectral test The spectral test checks the 
behavior of a generator when its outputs are used to form d- 
tuples. LFSR sequences can fail the spectral test in dimen- 
sions larger than the register size «. Closely related to this 
mechanism are the observed correlations in other empirical 
tests such as the birthday spacings test and the collision test 
I3^l40ll . Nonlinear generators do quite well in all these tests, 
but compared to LFSR sequences they have much less nice 
and provable properties and they are not suited for fair play- 
ing parallelization. 

To get the best of both worlds we propose a delinearization 
that preserves all the nice properties of linear pseudonoise se- 
quences: 

Theorem 7 Let [q] be a pseudonoise sequence in F„,, and 
let § be a generating element of the multiplicative group F*,. 
Then the sequence (r) with 



g'-^' mod m if qt > 







is a pseudonoise sequence, too. 



if^,=0 



(15) 



The proof of this theorem is trivial: since § is a generator of 
F*,, the map M5\ is bijective. We call delinearized generators 
based on Theorem YARN generators (yet another random 
number). 

The linearity is completely destroyed by the map M5\ . see 
Figure|3] Let Li^r){l) denote the linear complexity of the sub- 
sequence (ri,r2, . . . This function is known as the lin- 
ear complexity profile of (r). For a truly random sequence it 
grows on average like 1/2. Figure|4]shows the linear complex- 
ity profile of a typical YARN sequence. It shows the 
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Figure 3: Exponentiation of a generating element in a prime field 
is an effective way to destroy the linear structures of LFSR se- 
quences. Both pictures show the full pehod of the generator. Top: 
;•/ = 95 • r,_, mod 1999. Bottom: r, = 1099* mod 1999 with g, = 
95 •^/_, mod 1999. 



same growth rate as a truly random sequence up to the point 
where more than 99 % of the period has been considered. Shar- 
ing the linear complexity profile with a truly random sequence, 
we may say that the YARN generator is as nonlinear as it can 
get. 



5. Implementation and efficiency 

LFSR sequences over prime fields larger than F2 have been 
proposed as PRNGs in the literature 11113 [s^]. An efficient 
implementation of these recurrences requires some care, how- 
ever 

We assume that all integer arithmetic is done in w-bit regis- 
ters and m < 2"^ ' . Under this condition addition modulo m 




0.2 0.4 0.6 0.8 1 1.2 

l/T 



Figure 4: Linear complexity profile L(^r){l) of a YARN sequence, pro- 
duced by the recurrence g,- = 173-^,_i +219-t;,_2 mod 317 and r,- = 
151* mod 317. The penod of this sequence equals T = 317^ — 1. 

can be done without overflow problems. But multiplying two 
(w— l)-bit integers modulo m is not straightforward because 
the intermediate product has 2 ( w — 1 ) significant bits and can- 
not be stored in a w-bit register. For the special case a/t < ^/m 
Schrage llilll showed how to calculate a,;. • mod m without 
overflow. Based on this technique a portable implementation 
of LFSR sequences with coefficients a^. < is presented 
in fioll . For parallel PRNGs these methods do not apply be- 
cause the leapfrog method may yield coefficients that violate 
this condition. Knuth j5, section 3.2. LI] proposed a general- 
ization of Schrage's method for arbitrary positive factors less 
than m, but this method requires up to twelve multiplications 
and divisions and is therefore not very efficient. 

The only way to implement ({3} without additional measures 
to circumvent overflow problems is to restrict m to m < T'"/'^. 
On machines with 32-bit registers, 16 random bits per num- 
ber is not enough for some applications. Fortunately todays 
C compilers provide fast 64-bit arithmetic even on 32-bit 
CPUs and genuine 64-bit CPUs become more and more com- 
mon. This allows us to increase m to 32. 



5.1. Efficient modular reduction 

Since the modulo operation in ({5} is usually slower than other 
integer operations such as addition, multiplication, boolean 
operations or shifting, it has a significant impact on the to- 
tal performance of PRNGs based on LFSR sequences. If the 
modulus is a Mersenne prime m = 2^ — 1, however, the mod- 
ulo operation can be done using only a few additions, boolean 
operations and shift operations ■ 

A summand i = at • in ({3} will never exceed {m—iy = 
{7f — 2)^ and for each positive integer s G [0, (2'" — 1)^] there 
is a unique decomposition of s into 

s^r-2' + q with < ^ < 2^ (16) 
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From this decomposition we conclude 

s — r-2'' = q 
s-ril' -\)=q + r 
smoA (2"-l)=^ + rmod {l' -I) 

and r and q are bounded from above by 

q<2' and r< \{2' - if IT\ <T -1 

respectively, and therefore 

^ + r<2^ + 2"-2 = 2m. 

So if m = 2" — \ and s < (in—\f,x^s mod m can be calcu- 
lated solely by shift operations, boolean operations and addi- 
tion, viz 

X = mod 2*^) + [V2'J . (17) 

If (I17> yields a value x > m we simply subtract m. 

From a computational point of view Mersenne prime mod- 
uli are optimal and we propose to choose the modulus m = 
2^' — 1. This is the largest positive integer that can be rep- 
resented by a signed 32-bit integer variable, and it is also a 
Mersenne prime. On the other hand our theoretical consid- 
erations favor Sophie-Germain prime moduli, for which ( I17> 
does n ot ap ply directly. But one can generalize illl to moduli 
2" — k 14311 . Again we start from a decomposition of s into 

s = r-2'' + q with < ^ < 2^ (18) 

and conclude 

s-r-2'' = q 
s-r{2' ~k)=q + kr 
s mod (2'" — A:) = ^ + kr mod [2^ — k) . 

The sum s' = q + kr exceeds the modulus at most by a factor 
k+\, because by applying 

q<2'' and r < [{2" -k - \f /2' \ <2' ~k~ I , 

we get the bound 

q + kr <2'' +k{2' -k-\) = {k+\)m. 

In addition, by the decomposition of s' = q + kr 

s' = r'-2' + q' with < ^' < 2% 

it follows 

s mod {2' -k)= s' mod {2" -k)=q' + kr' mod {2" - k) , 
and this time the bounds 

q'<2' and r' < l{k+ 1){2' ~ k)/2' \ < k+ I 

and 

q' + k/ < 2' + k{k+\)^m + k{k + 2) 



Table I: Generators of the TRNG library. All PRNGs denoted by 
trng::mrg;i{s} are pseudonoise sequences over F„, with /; feedback 
terms, trng::yarnn{s} denotes its delinearized counterpart. The 
prime modulus 2^^' — 1 is a Mersenne prime, whereas all other moduli 
are Sophie-Germain primes. 



name 


m 




period 


trng::mrg2, tmg::yarn2 


231 


- 1 




tmg::mrg3, trng::yam3 


231 


- I 


«.2''3 


tmg::mrg3s, trng::yarn3s 


2^1 


-21069 


«.2''3 


tmg::mrg4, trng::yarn4 


231 


- 1 


P. 2124 


tmg::mrg5, trng::yam5 


231 


- 1 


«.2l55 


trng::mrg5s, trng::yarn5s 


231 


-22641 


R.2155 



hold. Therefore if m = 2' -k,s< [m - kf and k{k + 2) < 
m, X = s mod m can be calculated solely by shift operations, 
boolean operations and addition, viz 

s' = (s mod 2')+k\sl2''\ 

(19) 

x={s' moA2'')+k[s' /2'\. 

If ( I19> yields a value x > m, a single subtraction of m will 
complete the modular reduction. To carry out (I19> twice as 
many operations as for ( I17> are needed. But il9\ applies for 
all moduli m = 2^ — A: with k{k + 2) < m. Note that on systems 
with big enough word size (such as 64-bit architectures), just 
doing the mod -operation might well be faster than using (I19> . 



5.2. Fast delinearization 

YARN generators hide linear structures of LFSR sequences 
(q) by raising a generating element g to the power mod m. 
This can be done efficiently by binary exponentiation, which 
takes (logm) steps. But considering LFSR sequences with 
only a few feedback taps (« < 6) and m m 2^^ even fast expo- 
nentiation is significantly more expensive than a single itera- 
tion of (|3}- Therefore we propose to implement exponentia- 
tion by table lookup. If m is a 2e'-bit number we apply the 
decomposition 

qi = qi,i-2''' +qi,o with 
qi,i = [qi/2' J , qix) = qj mod 2' , 

and use the identity 

n = mod m = {g^"' i • gi- ° mod m (21) 

to calculate g''' mod m by two table lookups and one multipli- 
cation modulo m. If m < 2^^ the tables for {g^' )*■' mod m 
and gi'-^ mod m have 2'^ and 2'^ entries, respectively. These 
384 kbytes of data easily fit into the cache of modern CPUs. 
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Table II: Performance of various PRNGs from the TRNG library ver- 
sion 4.0 01 and the GNU Scientific Library (GSL) version 1.8 IH. 
The test program was compiled using the Intel G++ compiler ver- 
sion 9.1 with high optimization (option -03) and was executed on 
a Pentium IV 3 GHz. 



TRNG GSL 



generator 


PRNs per sec. 


generator 


PRNs per sec. 


trng::mrg2 


48 • lO* 


mt 19937° 


41 • 10* 


trng::mrg3 


43 • 10* 


r250'' 


85 ■ 10* 


tmg::mrg3s 


36- 10* 


gfsr4'^ 


77- 10* 


tmg::mrg4 


38- 10* 


ran2'' 


27- 10* 


tmg::mrg5 


38- 10* 


ranlux" 


8-10* 


tmg::mrg5s 


23 • 10* 


ranlux389/' 


5-10* 


tmg::yarn2 


26- 10* 






tmg::yarn3 


23 • 10* 






tmg::yarn3s 


16- 10* 






tmg::yarn4 


20- 10* 






tmg::yarn5 


22- 10* 






tmg::yarn5s 


13-10* 







"Mersenne Twister |14|| 

''sliift-register sequence over F2, r, = /"/-los ffi '■i-250 I29ll . see also sec- 
tionl6.1l 

Shift-register sequence over F2, r, = r,_47i ® r,_i586® rz-egssffi '■|-9689 031 
''Numerical Recipes generator ran2 |46|| 
■"RANLUX generator with default luxury level I47j 
^RANLUX generator with highest luxury level |47|| 

5.3. TRNG 

LFSR sequences and their nonlinear counterparts have been 
implemented in the TRNG software package, a portable and 
highly optimized library of parallelizable PRNGs. Paralleliza- 
tion by block splitting as well as by leapfrogging are sup- 
ported by all generators. TRNG is publicly available for 
download (a^- Its design is based on a proposal for the 
next revision of the ISO C++ standard TRNG uses 

64-bit-arithmetic, fast modular reduction and ( I19> and 
exponentiation by table lookup ( I2U to implement PRNGs 
based on LFSR sequences over prime fields, with Mersenne 
or Sophie-Germain prime modulus. Table|I]describes the gen- 
erators of the software package. 

Tableinishows some benchmark results. Apparently the per- 
formance of both types of PRNGs competes well with popu- 
lar PRNGs such as the Mersenne Twister or RANLUX. Abso- 
lute as well as relative timings in Table lUdepend on compiler 
and CPU architecture, but the table gives a rough performance 
measure of our PRNGs. 



6. Quality 

It is not possible to prove that a given PRNG will work well 
in any simulation, but one can subject a PRNG to a battery of 
tests that mimic typical applications. One distinguishes empir- 



ical and theoretical test procedures. Empirical tests take a fi- 
nite sequence of PRNs and compute certain statistics to judge 
the generator as "random" or not. While empirical tests focus 
only on the statistical properties of a finite stream of PRNs 
and ignore all the details of the underlying PRNG algorithm, 
theoretical tests analyze the PRNG algorithm itself by number- 
theoretic methods and establish a priori characteristics of the 
PRN sequence. These a priori characteristics may be used 
to choose good parameter sets for certain classes of PRNGs. 
The parameters of the LFSR sequences used in TRNG for ex- 
ample were found by extensive £omputer search Is^ to give 
good results in the spectral test |5i]. 

In this section we apply statistical tests to the generators 
shown in Table|I]to investigate their statistical properties and 
those of its leapfrog substreams. The test procedures are mo- 
tivated by problems that occur frequently in physics, and that 
are known as sensitive tests for PRNGs, namely the cluster 
Monte Carlo simulation of the two-dimensional Ising model 
and random walks. Furthermore we analyze the effect of the 
exponentiation step of the YARN generator in the birthday 
spacings test. Results of additional empirical tests are pre- 
sented in the documentation of the software library TRNG 

Q. 



6.1 . Cluster monte carlo simulations 

In I31I1 it is reported, that some "high quality" generators pro- 
duce systematically wrong results in a simulation of the two- 
dimensional Ising model at the critical temperature of the in- 
finite system by the Wolff cluster flipping algorithm 14^ . De- 
viations are due to the bad mixing properties of the generators 
(low macrostate entropy) l3^ . An infamous example for such 
a bad PRNG is the r250 generator l^. It combines e LFSR 
sequences over F2 to produce a stream of e-bit integers via 

n = n-wi'S)n-25o, (22) 

where denotes the bitwise exclusive-or operation (addition 
in F2). 

Figure |5] shows the results of cluster Monte Carlo simula- 
tions of the Ising model on a 16 x 16 square lattice with cyclic 
boundary conditions for the generator r250 and its leapfrog 
substreams. Each simulation was done ten times at the critical 
temperature applying the Wolff cluster algorithm. The mean 
of the internal energy ^mc, the specific heat cmc, and the em- 
pirical standard deviation of both quantities have been mea- 
sured. The gr ay s cale in Figure |5]codes the deviation from the 
exact values 150]. Black bars indicate deviations larger than 
four standard deviations from the exact values. The original 
r250 sequence and its leapfrog sequences yield bad results if 
the number of processes is a power of two. Since the statis- 
tical error decreases as the number of Monte Carlo updates 
increases, these systematic deviations become more and more 
significant and a PRNG with bad statistical properties will re- 
veal its defects. Figure |5] visualizes the inverse relation be- 
tween quantity and quality of streams of PRNs mentioned in 
the introduction. 
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Figure 5: Results of Monte Carlo simulations of the two-dimensional 
Ising model at the critical temperature using the Wolff cluster flipping 
algorithm and the generator r250 (top) and trng::mrg3s (bottom). Only 
each pth random number of the sequence was used during the simu- 
lations. See the text for details. 



The apparent dependency of the quaUty on the leapfrog pa- 
rameter p is easily understood: If /? is a power of two, the 
leapfrog sequence is just a shifted version of the original se- 
quence fisl '3^. On the other hand, if p is not a power of 
two, the leapfrog sequence corresponds to a LFSR sequence 
with a denser characteristic polynomial. For p = 3, e. g., the 
recursion reads 

n = r,-_i03 ® r,-_i52 ® r/_201 © ?"i-250 , 

and for p = 7, 

n = n_ 103 ® 124 ® 145 © '"i- 166 © '"i- 187 © 
'"i-208 © '"(■-229 © '",■-250 , 

respectively. The quality of a LFSR generator usually in- 
creases with the number of nonzero coefficients in its charac- 
teristic polynomial Ift ,51,,52J. a phenomenon that can be eas- 
ily explained theoretically 13411 . and that also holds for LFSR 
sequences over general prime fields jssi . 

While LFSR sequences over F2 with sparse characteristic 
polynomial are known to fail in cluster Monte Carlo simula- 
tions, LFSR sequences over large prime fields with dense char- 
acteristic polynomial perform very well in these tests. This 
can be seen in the lower half of Figure|5]for the trng::mrg3s 
generator; other generators from Table Uperform likewise. 
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Figure 6: The number of sites visited by A' random walkers after t 
time steps is 5iv(f) ~ f^. The plot shows the exponent 7 against time 
for generator trng::yarn3. 
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Figure 7: Results of the Sn test for generator trng::mrg5; correspond- 
ing results for other generators in TableQlook similar. See the text for 
details. 



6.2. Random walks 

The simulation of the two-dimensional Ising model in the pre- 
vious section measured the quality within distinct substreams. 
Potential interstream correlations are not taken into account. 
In issll some tests are proposed that are designed to detect cor- 
relations between substreams. One of them is the so-called 
Sn test. 

For the Sn test random walkers are considered. They 
move simultaneously and independently on the one-dimen- 
sional lattice, i. e., on the set of all integers. At each time step 
t each walker moves one step to the left or to the right with 
equal probability. For large times t the expected number SN{t) 
of sites that have been visited by at least one walker reaches 
the asymptotic form 5iv(f ) ^ f ^ with 7=1/2. In the 5a? test 
the exponent 7 is measured and compared to the asymptotic 
value 7= 1/2. 

Let p be the number of parallel streams or walkers. The 
motion of each random walker is determined by a leapfrog 
stream f, y based on the generators from Table|I] The exponent 
7 is determined for up to 16 simultaneous random walkers as 
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a function of time t. After about 1000 time steps the exponent 
has converged and fluctuates around its mean value. 

Figure|6lshows some typical plots of the exponent 7 against 
time t for generator trng::yarn3. The exponent 7 was deter- 
mined by averaging over 10^ random walks. Results for the 
other generators in Table |I]look similar and are omitted here. 
Figure Qpresents the asymptotic values of 7 as a function of 
the number of walkers for generator trng::mrg5. The mean 
value of 7 averaged over the interval 1500 < t < 2500 is plot- 
ted and the error bars indicate the maximum and the minimum 
value of 7 in 1500 < t < 2500. Almost all numerical results 
are in good agreement with the theoretical prediction 7=1/2 
and for other generators we get qualitatively the same results. 
Therefore we conclude that there are no interstream correla- 
tions detectable by the Sn test. 



6.3. Birthday spacings test 

The birthday spacings test is an empirical test procedure that 
was proposed by Marsaglia It is specifically designed 

to detect the hyperplane structures of LFSR sequences 13911 . 

For the birthday spacings test days < c/, < M are uni- 
formly and randomly chosen in a year of M days. After sort- 
ing these independent identically distributed random numbers 
into nondecreasing order b(^i), b(^2) i • • • i ^(w) ' ^ spacings 
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SN =fe(i)+M-/7(^) 

are defined. The birthday spacings test examines the distri- 
bution of these spacings by sorting them into nondecreasing 
order , i(2) 7 ■ ■ ■ i ^(n) ™d counts the number R of equal spac- 
ings. More precisely, R is the number of indices j such that 

The test statistics is a random 



1 <;■ <A?and 
number with distribution 



PR{r) 



eiV3/(4M)^, 



-e\ - 



(23) 



For N ^ 00 and M ^ 00 such that /{AM) ^ A = const., 
PR{r) converges to a Poisson distribution with mean X. Af- 
ter repeating the birthday spacings test several times one can 
apply a chi-square test to compare the empirical distribution 
of R to the correct distribution ( I23> . If the p-value of the chi- 
square test is very close to one or zero, the birthday spacings 
test has to be considered failed (H. 

In order to make the birthday spacings test sensitive to the 
hyperplane structure of linear sequences, the birthdays are 
arranged in a li-dimensional cube of length I, i. e., M — l'^. 
Each day b, is determined by a li-tuple of consecutive PRNs 
{rdi-,rdi+\,- ■ -Trdi+d-x). 



d-l 

7=0 



/ ■ rdi+ j 



(24) 



Figure 8: Due to their lattice structure LFSR generators fail the birth- 
day spacings test. The nonlinear mapping of YARN sequences is 
an effective technique to destroy these lattice structures. Top: mean 
number {R) of equal spacings between M randomly chosen birthdays. 
For a good PRNG = 1 is expected. Bottom: gray coded p- 

value of a chi-square test for the distribution of R. 



This bijective mapping transforms points in a ^f -dimensional 
space [0, 1 , . . . , Z — 1]'' to the linear space [0, 1 , . . . ,M — 1] and 
points on regular hyperplanes are transformed into regular 
spacings between points in [0, 1, . . . ,M— 1]. 

To demonstrate the failure of LFSR sequences in the birth- 
day spacings test we applied the test to the LFSR sequence 

= 17384 •r,_i + 12391 •r,_2 mod65521 (25) 
and its YARN counterpart 



_ f 20009'?' mod 65 52 1 if ^/ > 
'^'^[0 if^,=0 
qt = 17384 • + 12391 • ^,_2 mod 65521 



with 



(26) 



with the test parameters A = /{AM) w 1 and d = 6. Beyond 
a certain value of M the LFSR sequence starts to fail the birth- 
day spacings test due to its hyperplane structure, see Figure|8] 
The hyperplane structure of LFSR sequences give rise to birth- 
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day spacings that are much more regular than randomly cho- 
sen birthdays. As a consequence we observe a value R that is 
too large, see the top of Figure|8] 

The nonlinear transformation (I15> in the YARN sequences 
destroys the hyperplane structure. In fact, even those YARN 
sequences whose underlying linear part fails the birthday spac- 
ings test, passes the same test with flying colors (Figure |8}. 
YARN generators start to fail the birthday spacings test only 
for trivial reasons, i. e., when M gets close to the period of 
the generator. For the experiment shown in Figure Owe av- 
eraged over 5000 realizations of the M birthdays, and each 
birthday was determined hy d ~6 random numbers. Then we 
expect that the YARN generator starts to fail if M • 6 • 5 000 ^ 
655212- 1, i.e., M>2i^. 

Of course the period of the PRNGs j25> and iibi is artifi- 
cially small and the birthday spacings test was carried out with 
these particular PRNGs to make the effect of hyperplane struc- 
tures and its elimination by delinearization as explicit as pos- 
sible. "Industrially sized" LFSR and YARN generators like 
those in Table U share the same structural features, however. 
So the results for PRNGs ( 125 > and ( I26t can be assumed to 
present generic results for LFSR and YARN sequences, re- 
spectively. 



7. Conclusions 

We have reviewed the problem of generating pseudorandom 
numbers in parallel environments. Theoretical and practical 
considerations suggest to focus on linear recurrences in prime 
number fields, also known as LFSR sequences. These se- 
quences can be efficiently paralleHzed by block splitting and 
leapfrogging, and both methods enable Monte Carlo simula- 
tions to play fair, i. e., to yield results that are independent 
of the degree of parallelization. We also discussed theoreti- 
cal and practical criteria for the choice of parameters in LFSR 
sequences. We then introduced YARN sequences, which are 
derived from LFSR sequences by a bijective nonlinear map- 
ping. A YARN sequence inherits the pseudonoise properties 
from its underlying LFSR sequence, but its linear complexity 
is that of a true random sequence. YARN sequences share all 
the advantages of LFSR sequences, but they pass all tests that 
LFSR sequences tend to fail due to their low linear complex- 
ity. 

All our results have been incorporated into TRNG, a pub- 
licly available software library of portable, parallelizable pseu- 
dorandom number generators. TRNG complies with the pro- 
posal for the next revision of the ISO C++ standard 14^1 . 
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A. Sophie-Germain prime moduli 

The maximal period of a LFSR sequence ({3} over the prime 
field F,„ is given by m" — 1 . We are looking for primes m such 
that m" — 1 has for a fixed « a small number of prime factors. 

For m > 2 the factor m — 1 is even and the smallest possible 
number of prime factors of m" — 1 is three, 

m"-l=2-'^^—^-{l+m + m^+... + m"'-^). (Al) 

If (m — l)/2 is also prime, m is called a Sophie-Germain prime 
or safe prime. For m = 2 there exist values for n such that 2" — 
1 is also prime, primes of form 2" — 1 are called Mersenne 
primes, e. g., 2^ — 1 and 2^^^^^ — 1 are prime. 

If n is composite and m > 2 the period T is a product of at 
least four prime factors, 

»/■' - 1 = (m* - 1) • + (m^')'-2 + . . . + 1] 

= (OT-l)-(m'^"'+;7/"' + . .. + !)• (A2) 
[{m'^y-^ + im'')'-' + ... + !]. 

Actually, the number of prime factors for composite n is even 
larger. The factorization of m" — 1 is related to the factoriza- 
tion of the polynomial 

/„(x)=x"-l (A3) 

over Z. This polynomial can be factored into cyclotomic poly- 
nomials <I>j:(x) I5al . 

/„W=x"-l W- (A4) 

cl\n 

The coefficients of the cyclotomic polynomials are all in Z; 
so the number of prime factors of the factorization of m" — 1 
is bounded from below by the number of cyclotomic polyno- 
mials in the factorization of x" — I. This number equals the 
number of natural numbers that divide «. If « is prime then 
/„ (x) is just the product 4>i (x) ■ <!>„ (x). 

If m is a prime larger than two, from <t>i (x) = x — 1 it follows 
that m" — 1 can be factorized into at least three factors, namely 

m"-l=2.^. n ^^H- (A5) 

d>l;d\n 

The period of a maximal period LFSR sequence with linear 
complexity n over F„, is a product of exactly three factors, 
if and only if m is a Sophie-Germain prime, ii is prime, and 
<I>„(m) is prime also. Note, 'I>2(»j) = m + 1 is never prime, if 
m is an odd prime. Let us investigate some special cases in 
more detail. 

Case 11 = 2: The period n-? — 1 is a product of 2^ • 3 and at 
least two other factors. Using the sieve of Eratosthenes 
modulo 12 it can be shown, that each large enough prime 
m can be written as m = 12 • ^ + c, where k and c are 
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Table II 


1: A collection of Sophie-Germain primes 


/77 for which ni!^ — 1 


has a minimal number of prime factors. 




n 


m n 


tn 


1 


2^1-525 5 


2^1-46365 




2^1 -69 


2^1 -22641 




2''3-5781 


263-594981 




2''3-4569 


2^3- 19581 


2 


231 -37485 6 


231 _ 4398621 




2^1 -2085 


2-31 - 1 120941 




2''3 -927861 


2''3 - 122358381 




2*'^- 156981 


2^3 _ 29342085 


3 


2^1-43725 7 


231 _ 50949 




2^1 -21069 


231 _5489 




2^3-275025 


2^3 _ 92 181 




2^3 _ 21 129 


263 _ 52425 


4 


2^1-305829 






2^1- 119565 






2''3 -3228621 






2" - 156981 





integers such that gcd(12,c) = 1. Factoring the period 
ot2-1 = (12-yt + c)2-l we find 

(12-yt+l)2-l=23-3-fc-(6/t+l) 
(12-A: + 5)2-l =2^-3-(2fe+l)-(3/t+l) 
(12-A: + 7)2 - 1 =23-3-(3fe + 2)-(2A:+l) 
(12-yt+ll)2- 1 =2^-3-(;t+l)-(6/t + 5). 

Case 11= A: The period — 1 is a product of 2^1 • 3 • 5 and at 
least three other factors. Using the sieve of Erastosthenes 
modulo 60 it can be shown, that each large enough prime m 



can be written as m = 60 • + c, where k and c are integers 
such that gcd(60,c) = 1. Factoring the period ni^ — 1 = 
(60-/t + c)2- 1 we find 

(60 • /t + 1 - 1 = 2"* • 3 • 5 • A: • (30A: + 1 ) 
•(1800/fc2 + 60/t+l) 

(60-fc + 7)'^-l =2'*- 3-5 -(lOfc+l) -(15^ + 2) 
■{36Qk^ + Uk + 5) 

{6Q-k+\\f-\=2'^-3-5-{5k+\)-{6k+\) (A7) 
•(1800/t2 + 660/t + 61) 

(60 • + 1 3 - 1 = 2"* • 3 • 5 • (5/t + 1 ) • (30/t + 7) 
•(360/t2 + 156/t+17) 
and so on . . . 

Case n = 6: This case is similar to the n = 2 and n = 4 cases. 
Applying the sieve of Erastosthenes modulo 84 it can be 
shown that — 1 is a product of 23 • 3^ • 7 and at least four 
other factors. 

Cases « = 3,«=5,« = 7: Here n is prime, and therefore the 
period m" — 1 has at least three factors. The number of 
factors of m" — 1 will not exceed three, if m is a Sophie- 
Germain prime and <t>„ (m) is prime, too. 

In Tablelllllwe present a collection of Sophie-Germain primes 
m, for which m" — 1 has a minimal number of prime factors. 
For n prime an extended table can be found in l56ll . If n is 
prime, its factorization can be found by (IA4> . and if n = 2 or 
« = 4 by iA6\ and ( IA7> . respectively. All these primes are 
good candidates for moduli of LFSR sequences as PRNGs in 
parallel applications. Note that the knowledge of the factoriza- 
tion of m" — 1 is essential for an efficient test of the primitivity 
of characteristic polynomials. 
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